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We derive an automatic procedure for generating a set of highly localized, non-orthogonal orbitals 
for linear scaling quantum Monte Carlo calculations. We demonstrate the advantage of these orbitals 
in calculations of the total energy of both semiconducting and metallic systems by studying bulk 
silicon and the homogeneous electron gas. For silicon, the improved localization of these orbitals 
reduces the computational time by a factor five and the memory by a factor of six compared to 
localized, orthogonal orbitals. For jellium, we demonstrate that the total energy is converged for 
orbitals truncated within spheres with radii 7-8 r s , opening the possibility of linear scaling QMC 
calculations for realistic metallic systems. 



In recent years, one of the most promising develop- 
ments in the field of electronic structure calculations has 
been the development of algorithms whose cost grows as 
the first power of the system size. Linear scaling variants 
of several electronic structure techniques have been devel- 
oped, including tight-binding0, density functional the- 
ory (DFT)P, |3, 1111110, coupled cluster and quan- 
tum Monte Carlo (QMC) 8, 9]. In all these approaches 
extended Bloch orbitals, ipnk, are transformed into lo- 
calized Wannier-like orbitals. The speedup provided by 
the transformation to localized orbitals depends on the 
extent to which the orbitals can be localized and sub- 
sequently truncated. Therefore, improved methods for 
constructing localized orbitals have attracted intense at- 
tention in recent years 0,0, Of particular relevance 
to this paper is the recent demonstration that generaliz- 
ing from conventional orthonormal Wannier orbitals to 
non- orthogonal orbitalsQ, E H H E| can provide in- 
creased localization and further accelerate linear scaling 
DFT algorithms. 

In QMC calculations, truncated, localized orbitals can 
be used to introduce sparsity into the Slater determinant 
part of the trial wave function. As the calculation of the 
orbitals used to construct this determinant is the domi- 
nant cost of QMC calculations, this transformation yields 
a near linear scaling QMC algorithm. In our original ap- 
proach to linear scaling QMC calculations |8j, the Slater 
determinant was constructed from a set of orthonormal 
Wannier functions. This choice of orbitals produces a 
near linear scaling algorithm, which has successfully been 
applied to calculations of the total energies and optical 
gaps of a variety of semiconductor systems^lQ]. How- 
ever, this method suffers from three main limitations: (i) 
It is only applicable to systems where the Wannier func- 
tions decay rapidly (exponentially), i.e., it works well for 
semiconductors and insulators, but it is not applicable to 
metallic systems where orthonormal Wannier functions 
decay polvnomiallvflU . (ii) The Wannier functions are 
constructed via a unitary transformation of an input set 
of Bloch functions. This limits one to orthogonal func- 
tions, while a non-zero Slater determinant requires only 



that the orbitals are linearly independent, (iii) Truncat- 
ing a localized function reduces the volume in which it 
needs to be evaluated, speeding up the calculation. How- 
ever, the Metropolis algorithm samples configurations of 
electron coordinates from the many-body wavefunction, 
hence, some points are sampled more frequently than 
others. This knowledge of the wavefunction is not in- 
cluded in the generation of the orthonormal orbitals and 
hence the choice of orbitals is not optimal. 

In this letter we derive and demonstrate the use of a 
non- orthogonal transformation of the Bloch orbitals that 
overcomes the above limitations. This transformation is 
based on algorithms developed for linear scaling DFT 
calculations [l], L3- La- LJ- LJ| and is designed to minimize 
a cost function associated with the total number of or- 
bital evaluations required in a linear scaling QMC cal- 
culation. For representative semiconductor systems, the 
orbitals obtained from this non-orthogonal transforma- 
tion are significantly more localized and smoother than 
orthogonal Wannier functions, and can typically be trun- 
cated in one sixth of the volume of the equivalent orthog- 
onal function without sacrificing accuracy. This produces 
an algorithm ~5 times faster than previous linear scaling 
QMC calculations jg| and requiring one sixth of the mem- 
ory. In addition, we demonstrate that while orthogonal 
Wannier functions for metallic systems cannot be trun- 
cated within a practical volume, non-orthogonal orbitals 
constructed via our procedure can be truncated within a 
practical cutoff radius. 

Our QMC calculations use a linear scaling^ version 
of the CASINO [15j code with a standard Slater- Jastrow 
trial wavefunction, \1/t(R)^1|. The Slater determinants 
are constructed from a set of truncated, localized linearly 
independent orbitals Dij — (f>i(rj)&i(Tj), where <f> are 
are the non-orthogonal orbitals and O are the truncation 
functions. In principle, one can optimize the shape of the 
truncation functions, however, for the systems studied 
here, we find that spherical step functions are a simple 
and stable solution where, 
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The truncation functions O l are defined by two param- 
eters, the cutoff radii i?™' and the centers Ri. These 
parameters are optimized iteratively using a procedure 
designed to minimize the computational cost of the QMC 
calculation. The non-orthogonal orbitals, <f>i, associated 
with each 6* are obtained during the iterative process. 

The computational cost of a typical QMC calculation 
is proportional to the number of orbital evaluations re- 
quired to construct the Slater determinant for each con- 
figuration of electron coordinates, R = (ri, r2...r/v)- The 
cost is therefore the product of the probability, |vf' T (R)| 2 , 
of sampling a given configuration, R and the cost of eval- 
uating each of the non-zero elements in the Slater deter- 
minant produced by that configuration. For each ele- 
ment, <pi(rj), if Yj falls within the truncation function, 
0^, this adds 1 to the cost, i.e. 
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Cost= / dR |* T (R)| 2 J]e j (r j ) 
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By integrating out all but one electron coordinates, 
Eq.© can be expressed in terms of the density p(r) as 



Cost = ^ / drp(r)9i(r) 



(3) 



We find a satisfactory minimum of Eq. © by starting 
from an initial choice of and iteratively updating first 
the cutoff radii R^ ut and then the centers R^. 
(i) Generating Optimal N on- Orthogonal Orbitals and 
Cutoff Radii: Each truncation function, Oi, can be con- 
sidered as a potential, V, acting on the Hilbert space 
of Bloch orbitals. In the inset to Fig^i this potential, 
V, is shown with a blue line. If one constructs the ma- 
trix elements of the Bloch orbitals with this potential, 
V i =< <pf loch \&(r)\<j)i loch >, then the eigenstate 4> % of 
V with the largest eigenvalue is the most localized state 
within the truncation region. This is the orbital with the 
maximum truncated norm, X, defined as 



X= / dr|^(r)| 2 e, ; (r) 



(4) 



Increasing Rf 1 * increases the above value of X, reduc- 
ing the resulting truncation error in the QMC calcula- 
tion, but also increases the computational cost in Eq. © . 
Therefore, we adjust the cutoff radius R? ut to achieve a 
target norm, e.g. X — 0.999. Repeating this diagonal- 
ization procedure for each truncation function <d l gen- 
erates an associated set of non-orthogonal orbitals {4>}. 
This procedure for generating a set of non-orthogonal or- 
bitals associated with a set of truncation regions is sim- 
ilar to those adopted in linear scaling density functional 
calculations[l|,|3J,yi,U4] and recently in a QMC calculation 
of MgO@. Next, we extend this procedure to automati- 
cally optimize the centers of the truncation functions for 
systems where they cannot be guessed a priori. 
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FIG. 1: (Color) Comparison of the norm of orthogonal and 
non-orthogonal localized orbitals in (a) bulk silicon and (b) 
a HEG at the same r 3 . The error bars show the spread in 
norm between the states. The insets compare the shape of 
the orthogonal and non-orthogonal localized orbitals. 



(ii) Updating the Truncation Centers: The cost function 
in Eq. (|3J) can be rewritten as 



Cost = NX + ]T J dv [p(r) - | (r) | 2 ] 9, (r) 



(5) 



where N is the number of orbitals and X is defined in 
Eq.Q. The first term in Eq.|0 NX, cannot be reduced 
without losing accuracy. Therefore the only way to re- 
duce the computational cost is to minimize the second 
term in Eq.lJSJby placing the truncation centers where 
p(r) — \4>i(r)\ 2 is minimum. Since p(r) > |0j(r)| 2 , this 
is minimized in regions where 4>i is most localized and 
therefore closest to p. Therefore for the next iteration, 
we move the truncation centers towards the center of 
mass of the |^(r)| 2 for the current iteration. To ensure 
linear independence, we orthogonalize the set {<jf\ with 
a polar decomposition before calculating this center of 
mass. 

This updated set of truncation functions, O*, with new 
centers, Ri, are then used to generate a new set of non- 
orthogonal orbitals using the procedure in (i) above and 
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(b) Homogeneous Electron Gas (r =2) 
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FIG. 2: (Color) Comparison of the decay of orthogonal , self 
consistent, non-orthogonal and fixed radius, non-orthogonal 
localized orbitals in (a) bulk silicon and (b) HEG (r s — 2). 



the process is repeated. Starting from a random choice 
of centers 10-15 iterations are typically required to find 
a minimum of Eq. |PJ and to converge the centers. If 
one uses a good starting set of centers, such as the cen- 
ters of maximally localized Wannier functions 01 > the 0i 
converge in 1 or 2 iterations. 

To analyze the properties of these new non-orthogonal 
orbitals we first compare their localization and decay 
properties with an equivalent set of orthogonal orbitals. 
We then examine the convergence of the total energy in 
quantum Monte Carlo calculations using these orbitals. 
Comparisons are made for the prototypical semiconduc- 
tor and metal systems, silicon and the homogeneous elec- 
tron gas. 

In Fig.^the norm within a spherical truncation region 
of the orthogonal and non-orthogonal localized orbitals 
are compared as a function of R cu t- Fig- da) compares 
orbitals constructed for bulk silicon. The input states 
were obtained from a 64 atom LDA calculation 0], us- 
ing a norm-conserving Hamann pseudopotential and a 35 
Ry cutoff. The non-orthogonal orbitals were constructed 
using the iterative procedure described above, where the 
desired norm, A, was varied from 0.5 to 0.99999 to con- 
struct the plot. The orthogonal states were obtained 
by performing a final polar decomposition orthogonal- 
ization step. Due to symmetry all the non-orthogonal 
orbitals are equivalent. Within a given radius, the non- 



orthogonal orbitals contain significantly more charge, e.g. 
99.9% of the norm is contained within a sphere of radius 
5.5 au compared to orthogonal orbitals which require an 
11 au sphere to capture the same charge. The origin of 
this dramatically improved localization is shown in the 
inset to Fig.^a) which shows a line plot through the cen- 
ter of the R cu t — 2 orbitals. While the non-orthogonal 
states decay smoothly to zero with minimal oscillations, 
the orthogonal orbitals oscillate around zero for > 5 au 
after initially crossing zero to maintain orthogonality be- 
tween states. While the amplitude of these oscillations 
is small compared to the central peak, the r 2 prefactor 
leaves a significant amount of charge in these oscillations. 

Comparing the orthogonal orbitals shown in Fig. QJa) 
with maximally localized (MLW) orbitals constructed ac- 
cording to Ref.[lTj, which essentially finds the localized 
eigenstates of the e l27rr / L operator, we find the centers of 
our non-orthogonal and orthogonal states are identical 
to the MLW function centers due to symmetry. Addi- 
tionally, the shape and norm convergence of our orthog- 
onal states is almost identical to the MLW functions. It 
therefore appears that the shape of orthogonal localized 
orbitals is relatively insensitive to the choice of operator 
used to localize the states. 

Figure njb) shows orthogonal and non-orthogonal or- 
bitals constructed for the HEG with r s = 2 (same as 
silicon). The input states were the lowest 1935 plane 
waves in a 50 au cubic box. The norm of the orthogonal 
orbitals slowly approaches 1.0 as the radius is increased, 
as would be expected given the slow polynomial decay of 
orthogonal orbitals in metallic svstems|ll|. In contrast, 
the non-orthogonal orbitals rapidly approach 1. For ex- 
ample 99.9% of the norm of the non-orthogonal orbitals 
is contained within a sphere of radius 7 au, while even 
the largest sphere inscribed within the supercell (25 au 
radius) contains only 94% of the norm of the orthogonal 
orbitals. Note, the non-orthogonal orbitals are still less 
localized than those in silicon, where 99.9% of the norm 
is contained within a sphere of radius 5.5 au compared 
to the 7 au required for jellium. As in silicon, the in- 
set plot shows pronounced, long range oscillations in the 
orthogonal orbitals and a much smoother decay of the 
non-orthogonal orbitals with minimal oscillation. 

Figure compares the truncated decay of orthogonal 
and non-orthogonal localized orbitals for bulk silicon and 
the HEG. This is equivalent to the decay of the trace of 
the non-orthogonal density matrixQ, 0, Q ■ Here we de- 
fine the decay as 1 minus the norm contained within a 
sphere of radius R. As expected from Fig.Q Fig.|3shows 
that the non-orthogonal orbitals (black dashed lines) de- 
cay more rapidly than the equivalent orthogonal orbitals 
(red solid lines). Figure [5] also illustrates that to obtain 
maximum localization within a given volume, the non- 
orthogonal orbitals must be adjusted consistently with 
Rcut PJi- The blue dotted line in Fig [2b shows the decay 
of a non-orthogonal orbital optimized to be maximally 
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(b) DMC Truncation error of HEG 
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FIG. 3: (Color) (a) Convergence of DMC total energy of 
bulk silicon with truncation radius for orthogonal and non- 
orthogonal orbitals, (b) Convergence of DMC total energy of 
the HEG as a function of R cu t jr a 



localized in a 9 function with R cu t = 10. The trun- 
cated norm was then evaluated for a range of R values 
while keeping the orbital fixed. For R cu t = 10 the "self- 
consistent" and fixed non-orthogonal orbitals are identi- 
cal. For all other values of R cu t , the orbital optimized 
with R cut — 10 (blue, dotted) is no longer the optimal 
orbital for that choice of function and it therefore has 
a lower truncated norm. 

Our previous work with orthogonal, truncated, local- 
ized orbitals indicated that the norm of these orbitals 
was a good predictor of the truncation error in a QMC 
calculation the total energy. For silicon we found that a 
truncation region large enough to capture 99.9% of the 
norm was sufficient to produce a converged total energy. 
On this basis, the improved decay properties of the non- 
orthogonal orbitals shown in Figs, ri] and |21 suggest that 
these orbitals can be used to perform QMC calculations 
with smaller truncation radii than those used for pre- 
vious orthogonal orbitals, without sacrificing accuracy. 
However, the decay of the localized orbitals in a given 
representation does not predict the truncation error in 
the density matrix and how electron-electron correlation 
will be affected. Therefore, to fully evaluate the proper- 
ties of these non-orthogonal orbitals we have performed 
VMC and DMC total energy calculations of bulk silicon 
and the HEG to compare the convergence of the total 
energy with the truncation radii of the localized orbitals. 

Figure EJa) compares the convergence of the DMC to- 
tal energy of the same 64 atom, bulk silicon system shown 
in Figs, rfla) and Fig. [21 (a), using orthogonal and non- 



orthogonal input orbitals. It shows that the DMC energy 
converges more rapidly using non-orthogonal orbitals. To 
converge the total energy to with 0.01 eV per atom using 
orthogonal orbitals required R cu t — H au 8] while using 
non-orthogonal orbitals, equivalent accuracy can be ob- 
tained with R cu t — 6 au. This results in a factor of 5 
increase in speed and a factor of 6 reduction in memory. 

Figure[3Jb) compares the convergence with R cu t of the 
DMC total energy of a homogeneous electron gas with 
r s = 1,2,5 and 10. In the HEG, the non-orthogonal or- 
bitals for all r s values can be obtained by scaling the 
r s = 1 orbital. The kinetic energy scales as rj 2 . To en- 
able us to plot all values of r s on the same plot, we rescale 
both axes and plot the fractional DMC truncation error, 
defined as Error(i? cut ) = [E(R cut ) — E^/E^ as a func- 
tion of R cut I r s . After this rescaling the convergence plots 
for each value of r s fall on a similar curve. Note, the neg- 
ative truncation error around R cu t /f s = 6 resulting from 
a loss of kinetic energy, due to abrupt truncation of the 
orbitals. This curve shows that the total DMC energy is 
approximately converged for truncation radii of 7 — 8r s . 
These converged values are in excellent agreement with 
the original values from Ceperley and Alder [l^. There- 
fore, while the slower polynomial decay of the density 
matrix of metallic systems requires a larger truncation 
radius to converge the total energy than for semiconduc- 
tors with equivalent density, the above procedure for gen- 
erating non-orthogonal orbitals does allow the localized 
orbitals for metallic systems to be truncated in a prac- 
tical volume for linear scaling calculations. In addition, 
the above procedure for generating these non-orthogonal 
orbitals does not require the high symmetry of the HEG 
and therefore this approach could be equally applied to 
linear scaling DMC calculations of realistic metallic sys- 
tems. 

In conclusion, we derive a simple, automatic pre- 
processing procedure for generating non-orthogonal lo- 
calized orbitals which minimize the total computational 
cost of linear scaling QMC calculations. We demonstrate 
the application of these orbitals to DMC calculations of 
the prototypical semiconductor and metallic systems, sil- 
icon and the HEG. We anticipate that these orbitals may 
also have applications in alternative electronic structure 
techniques such as DFT which also utilize localized or- 
bitals to generate linear scaling. 
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